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ABSTRACT 

A  stratification  and  manifold  learning  approach  for  analyz¬ 
ing  High  Angular  Resolution  Diffusion  Imaging  (HARDI) 
data  is  introduced  in  this  paper.  HARDI  data  provides  high¬ 
dimensional  signals  measuring  the  complex  microstructure 
of  biological  tissues,  such  as  the  cerebral  white  matter.  We 
show  that  these  high-dimensional  spaces  may  be  understood 
as  unions  of  manifolds  of  varying  dimensions/complexity  and 
densities.  With  such  analysis,  we  use  clustering  to  character¬ 
ize  the  structural  complexity  of  the  white  matter.  We  briefly 
present  the  underlying  framework  and  numerical  experiments 
illustrating  this  original  and  promising  approach. 

Key  words:  Stratification  and  manifold  learning,  DTI,  HARDI, 
complexity,  white  matter  connectivity. 

1.  INTRODUCTION 

Diffusion  MRI  is  a  powerful  extension  of  MRI  that  maps  how 
local  diffusion  affects  the  MR  signal,  in  multiple  sampling  di¬ 
rections,  providing  exquisite  insight  into  local  white  matter 
fiber  orientation.  Water  diffusion  in  the  brain  occurs  prefer¬ 
entially  along  fiber  bundles  and  is  hindered  in  orthogonal  di¬ 
rections,  reflecting  brain  architecture  at  a  microscopic  scale. 
In  the  Diffusion  Tensor  (DT)  model  m,  a  tensor  describes 
local  3D  diffusion  as  the  3x3  covariance  matrix  of  a  Gaus¬ 
sian  distribution,  modeling  the  averaged  diffusion  properties 
of  water  molecules  (in  a  typical  l-3mm  sized  voxel).  High 
Angular  Resolution  Imaging  (HARDI)  overcomes  limitations 
of  DTI  for  characterizing  complex  tissue  geometries  such  as 
fiber  crossing,  measuring  diffusion  along  30-100  or  more  di¬ 
rections  uniformly  distributed  on  the  sphere.  From  this  high¬ 
dimensional  signal  Sj (x),j  =  1 , . . . ,  n,  where  x  £  Q  C  R3 
is  a  voxel  of  interest  and  Q  the  acquisition  grid,  spherical 
functions  such  as  the  ADC  profile  or  the  Orientational  Dis¬ 
tribution  Function  (ODF)  may  be  approximated  using  a  mod¬ 
ified  spherical  harmonic  (SH)  basis  (5)-  The  ODF  provides 
a  non-parametric  model  of  fiber  distribution,  and  is  the  radial 
projection  of  the  underlying  probability  density  function  for 
molecular  motion. 

Analyzing  the  structure  of  such  complex  datasets  will  lead 
to  a  better  understanding  of  brain  tissue  microstructure  and 
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connectivity.  Methods  such  as  Him  ED  have  been  proposed 
to  characterize  the  anisotropy  of  tissues  and  differentiate  be¬ 
tween  isotropic,  mono-  and  multi-fiber  configurations,  from 
the  SH  expansion  or  full  profile  of  the  ADC  and  from  ODFs 
132.  These  techniques  consider  each  voxel  independently 
and  do  not  try  to  explain  the  dataset’s  global  structure. 

Concepts  from  Riemannian  geometry  (e.g.,  mu  and  ref¬ 
erences  therein)  and  manifold  learning,  132,  have  been  used 
to  characterize  the  distribution  of  tensors,  perform  statistical 
analysis  and  segment  DTI,  e.g.,  using  diffusion  maps  to  clus¬ 
ter  ODF  fields  ED.  These  approaches  consider  the  elements 
of  interest  (tensors,  ODFs)  on  a  single  manifold  (e.g.,  a  sub¬ 
manifold  of  R°  in  the  tensor  case).  However,  diffusion  MRI 
data  does  not  belong  to  a  single  manifold  but  to  a  stratifica¬ 
tion,  i.e.,  the  union  of  manifolds  with  different  dimensions 
(complexities)  and  densities.  Regions  with  or  without  fiber 
crossings  clearly  belong  to  manifolds  with  different  dimen¬ 
sionality/complexity  (requiring  a  different  number  of  param¬ 
eters).  The  single  dimension  (complexity)  assumption  is  ac¬ 
curate  only  for  small  regions.  The  effort  should  then  switch 
toward  the  study  of  stratifications  GO,  of  which  manifolds  are 
a  particular  case.  Studying  the  different  manifolds  in  the  data 
may  also  indicate  the  existence  of  different  complexities  in 
the  data.  Here  we  use  stratification  to  quantify  the  local  com¬ 
plexity  of  DTI  and  HARDI  datasets  and  relate  these  findings 
to  neuro-anatomical  knowledge. 

As  we  show  experimentally,  we  can  cluster  diffusion  MRI 
datasets  by  considering  them  as  point  clouds  in  ]Rm  (m  >  6 
depends  on  the  order  of  the  SH  approximation  of  ODFs), 
without  any  spatial  knowledge.  We  show  that  the  estimated 
complexity  correlates  with  the  expected  fiber  geometry  in 
well-known  regions  of  interest. 

2.  STRATIFICATION  LEARNING 

A  framework  for  the  regularized  and  robust  estimation  of 
non-uniform  dimensionality  and  density  in  high-dimensional 
noisy  data,  i.e.,  stratifications,  was  introduced  in  Q.  High¬ 
dimensional  sample  points  are  modeled  as  a  process  of  Trans¬ 
lated  Poisson  mixtures,  with  regularizing  restrictions,  and  a 
noise  model  is  incorporated.  Levina  and  Bickel  G)  proposed 
a  geometric  and  probabilistic  method  to  estimate  the  local 
dimension  and  density  of  point  cloud  data.  In  j5j  we  showed 
how  noise  is  naturally  incorporated  in  this  model,  to  obtain  a 


dimension  estimator  which  is  robust  to  noise  (noise  brings  the 
data  outside  of  the  manifold  into  the  ambient  space,  thereby 
misleading  the  dimensionality  computation).  We  tackle  the 
more  general  case  where  the  point  cloud  data  is  a  sampling  of 
two  or  more  manifolds  of  different  dimensions  and  densities 
(a  stratification);  we  cluster  the  noisy  data  points  according 
to  these  parameters.  We  do  not  assume  subspaces  are  linear 
(e.g.,  !H),  and  we  simultaneously  estimate  the  soft  cluster¬ 
ing  and  the  intrinsic  dimension  and  density  of  the  clusters 
while  being  robust  to  noise  and  outliers. 

If  we  sample  an  m-dimensional  manifold  with  T  points, 
the  proportion  of  points  falling  into  a  ball  around  xt  is  ^  ~ 
p(xt)V(m)Ric{xt)m  Q-  The  given  point  cloud,  embedded 
in  high  dimensions  D,  is  A'  =  {xt  £  R13;  t  =  1, . . . ,  T}, 
k  is  the  number  of  points  inside  the  ball,  f>(xt)  is  the  local 
sampling  density  at  point  Xt,  V(m)  is  the  volume  of  the  unit 
sphere  in  Rm,  and  Rk(xt)  is  the  Euclidean  distance  from  xt  to 
its  k- th  nearest  neighbor  (kNN).  The  inhomogeneous  process 
N(R,xt ),  which  counts  the  number  of  points  falling  into  a 
small  D-dimensional  sphere  B(R,Xt)  of  radius  R  centered 
at  xt,  is  a  binomial  process.  Under  certain  assumptions  it  can 
be  approximated  by  a  Poisson  process  and  the  rate  A  of  the 
counting  process  N(R,xt)  can  be  expressed  as  A (R,Xt)  = 
p(xt)V  (m)mRm~1 .  The  local  intrinsic  dimension  estimator 
at  each  point  xt  is  obtained  from  the  Maximum  Likelihood 
estimator  based  on  a  Poisson  distribution  with  this  rate. 

Usually,  noise  contaminates  the  point  samples,  so  the 
observed  point  process  is  not  a  simple  sampling  of  a  low¬ 
dimensional  manifold  but  a  perturbation  of  this  sample  pro¬ 
cess.  We  model  this  with  a  Translated  Poisson  Process  Co), 
which  translates  an  underlying  (unobservable)  point  process 
into  an  output  (observable)  point  process.  The  input  and 
output  spaces  of  the  points  are  not  necessarily  of  the  same 
dimension.  An  input  point  at  location  x  in  the  input  space  X 
is  randomly  translated  to  a  location  z  in  the  output  space  Z , 
according  to  a  conditional  probability  density  f(z\x),  called 
the  transition  density.  We  consider  the  particular  case  where 
each  point  is  translated  independently  of  the  others  and  no 
deletions  or  insertions  occur  in  the  translation  process.  Then, 
any  translated  Poisson  process  with  an  integrable  intensity 
function  {A (a:)  :  x  £  A}  is  also  a  Poisson  process  with 
intensity  p,(z)  =  fx  f{z\x)\(x)dx  iflOl. 

Since  the  intensity  of  the  Poisson  process  in  our  model  is 
parametrized  by  the  Euclidean  distances  of  the  points  (and  not 
by  the  points  themselves),  we  consider  a  random  translation 
in  the  distances.  This  means  that  we  do  not  observe  the  origi¬ 
nal  distances  but  noisy  distances.  Let  f(s\r)  be  the  transition 
density  which  defines  the  random  process  which  translates  a 
distance  r  in  the  input  space  to  a  distance  s  in  the  observ¬ 
able  space.  If  A(r,  Xt)  is  the  local  rate  of  the  Poisson  process 
which  defines  the  counting  process  in  the  input  space,  then 
p(s),  the  intensity  of  the  Poisson  process  in  the  output  space, 
is  given  by  p(s,  xt)  =  J^f(s\r)p(xt)V(m)mrm~1dr.  We 
consider  R'  >  R  since  points  originally  at  distance  greater 


than  R  from  Xt  can  be  placed  within  a  distance  less  than  R 
after  the  translation  process. 

Maximizing  the  likelihood  of  the  new  Translated  Poisson 
process,  we  obtain  the  following  expression  for  the  local  di¬ 
mension  m(xt)  at  point  Xt  when  we  use  the  k  nearest  neigh¬ 
bors  (k- NN)  instead  of  the  points  closer  than  R, 

m(  1  yJ  jf  /(fli(a:t)|r)rm-1  log  ^ dr' 1 

/c_1i= i  f*  /(Riixt^ry^dr 

(1) 

where,  by  an  abuse  of  notation,  we  identify  m  =  m(xt)  in 
the  right  hand  side.  This  expression  reduces  to  the  Levina 
and  Bickel  estimator  13  when  f(s\r)  =  6(s  —  r),  i.e.,  there 
is  no  translation  of  the  original  points  (the  noise-free  case). 
Equation  ([T)  is  a  nonlinear  recursive  expression  in  m  which  is 
difficult  to  solve.  We  approximate  it  by  an  easier  to  compute 
closed  expression,  with  explicit  bounds  on  the  approximation, 


m(  xt) 


1  Jo*  f(Ri\r)  loS  l^dr 


11~i  f0n  f(Ri\r)dr 


(2) 


see  0  for  details. 

These  estimators  are  local  as  they  come  from  a  Maximum 
Likelihood  (ML)  at  each  point  xt  based  on  a  Translated  Pois¬ 
son  distribution  modeling  the  counting  process  in  the  local 
ball  B(R ,  xt).  We  compute  an  ML  on  the  whole  point  cloud 
data  at  the  same  time  (not  just  for  each  point  independently), 
based  on  a  Translated  Poisson  Mixture  Model,  which  accom¬ 
modates  noise  and  different  classes  (each  with  its  own  dimen¬ 
sion  and  sampling  density).  This  technique  gives  a  soft  clus¬ 
tering  according  to  dimensionality  and  density,  and  estimates 
both  quantities  for  each  class.  This  Translated  Poisson  Mix¬ 
ture  Model  (TPMM)  is  solved  with  an  EM  algorithm,  which 
leads  to  explicit  estimations  of  each  cluster  dimensionality 
and  density,  as  well  as  the  probability  of  each  point  to  be¬ 
long  to  each  cluster,  see  0  for  details  on  the  theory  and  the 
very  efficient  computation. 

We  tested  the  framework  for  clustering  real  data  in  com¬ 
puter  vision  applications  (scanned  digits,  faces  under  vary¬ 
ing  pose  and  illumination,  different  activities  and  motion  in 
video),  obtaining  state-of-the-art  results  for  the  soft  clustering 
on  non-linear  stratifications.  Here  we  extend  this  framework 
to  the  stratification  defined  by  HARDI,  providing  insight  into 
the  varying  complexity  of  brain  connections. 


3.  BRAIN  MICROSTRUCTURE  COMPLEXITY 

Diffusion-weighted  images  Sj,j  =  1  ...n,  were  acquired  on 
a  4T  Bruker/Siemens  MRI  scanner  using  an  optimized  diffu¬ 
sion  tensor  sequence^]  30  images  were  acquired,  3  with  no 

'imaging  parameters  were:  21  axial  slices  (5  mm  thick),  FOV=23  cm, 
TR/TE=6090/91.7  ms,  0.5  mm  gap,  with  a  128  X  100  acquisition  matrix 
(1.8  mm  in-plane  resolution). 


diffusion  sensitization  So  and  n= 27  diffusion-weighted  im¬ 
ages  at  b=1100  s/mm2.  Gradient  directions  were  evenly  dis¬ 
tributed  on  the  hemisphere. 

Diffusion  tensors  were  computed  with  the  standard  least- 
squares  procedure  using  the  linearized  Stejskal-Tanner  equa¬ 
tion  for  free  anisotropic  diffusion  m  The  signal  attenua¬ 
tion  S(qj ,  t)  obtained  with  the  classical  Pulsed  Gradient  Spin 
Echo  (PGSE)  sequence,  for  a  given  diffusion  gradient  wave- 
vector  q j  and  diffusion  time  t,  is  related  to  the  displacement 
probability  function  of  water  molecules  P( r,  r)  by  a  Fourier 
transform,  S(q7-.  t)  =  SQ  fR3  P(r,  r)e_2’Tiqj  rdr.  ODFs,  de¬ 
fined  as  the  radial  projection  of  the  PDF  P(r,  r),  only  con¬ 
serve  the  angular  information  of  P,  which  is  sufficient  to  re¬ 
cover  the  underlying  orientation  distribution  of  fibers.  ODFs 
were  approximated  by  a  linear  transform  of  the  signals  SH  co¬ 
efficients  S(q,  t)  J3).  We  examined  the  complexity  of  the  sig¬ 
nal  attenuation,  diffusion  tensors,  and  SH  series  coefficients 
of  the  ODFs.  They  respectively  correspond  to  point  clouds 
in  R30,  R6  and  R^1)^2)/2,  for  SH  series  of  order  l.  The 
k- NN  estimation  of  the  local  dimension  only  weakly  exploits 
the  spatial  information  provided  by  the  regular  sampling  of 
points  on  the  acquisition  grid  Q  C  R3,  leaving  room  for  fu¬ 
ture  improvements. 

We  first  studied  how  the  input  model  influences  the  con¬ 
sistency  of  the  dimension/density  estimation  and  clustering. 
In  neighborhoods  of  size  k  =  60,  our  algorithm  analyzed 
the  raw  HARDI  signal  (points  in  R30),  the  4t/l  and  6th  order 
ODFs  (points  respectively  in  R15  and  R28),  and  their  sharp¬ 
ened  versions  (Fig.  [TJ.  ODF  sharpening,  0,  enhances  the 
angular  contrast  of  the  spherical  functions  to  better  differenti¬ 
ate  fiber  compartments  and  potentially  improve  tractography. 
Clusterings  from  4th  and  6th  order  ODFs  are  almost  identi¬ 
cal,  as  30  gradients  may  be  insufficient  to  fit  a  detailed  6th 
order  model. 


Fig.  2.  Increasing  complexity  in  the  forceps  minor. 

However,  clusterings  obtained  from  the  ODFs  are  clearly 
better  than  those  from  the  raw  HARDI  data;  we  can  read¬ 
ily  distinguish  (Fig.  |TJ)  the  gray  matter  in  green,  com- 


Table  1.  Influence  of  the  input  model  on  complexity. 


Color 

Red 

Green 

Blue 

Yellow 

L.  blue 

Purple 

HARDI 

Dim. 

1.55 

4.88 

5.92 

4.32 

5.59 

5.67 

Dens. 

9.27 

16.01 

10.69 

2.42 

13.18 

15.85 

Prob. 

0.65 

0.18 

0.005 

0.002 

0.026 

0.088 

ODF  4 

Dim. 

1.33 

4.53 

4.64 

2.56 

5.32 

5.41 

Dens. 

12.53 

26.70 

20.59 

7.73 

25.96 

28.57 

Prob. 

0.70 

0.16 

0.014 

0.002 

0.038 

0.092 

ODF  6 

Dim. 

1.34 

4.52 

4.64 

2.57 

5.33 

5.40 

Dens. 

12.54 

26.64 

20.57 

7.74 

25.94 

28.49 

Prob. 

0.70 

0.16 

0.014 

0.002 

0.037 

0.092 

plex  white  matter  in  purple  (e.g.,  forceps  minor/major, 
anterior/posterior  corona  radiata  or  superior  longitudinal 
fasciculus),  anisotropic  white  matter  in  light  blue  (e.g., 
genu/splenium  of  the  corpus  callosum  or  internal  capsule), 
and  highly  anisotropic  white  matter  in  blue  (e.g.,  genu  of  the 
corpus  callosum,  cortico-spinal  tract).  However,  the  cluster 
dimensions/densities  do  not  match  the  expected  decrease  in 
complexity  when  going  from  complex  to  very  anisotropic 
white  matter  (Table  [TJ,  perhaps  because  the  raw  HARDI 
signal  is  noisy.  4th  and  6th  order  ODFs  regularize  the  spher¬ 
ical  distribution  by  only  considering  low-frequency  spher¬ 
ical  harmonics  and  impose  some  smoothness  on  the  fitted 
ODFs.  This  translates  into  improved  clustering  results  and 
estimated  dimensions/densities  nicely  matching  the  white 
matter  complexity.  The  complex  white  matter  is  perfectly 
labeled  (purple;  Fig.  |T|),  whereas  some  large  areas  were 
missing  (and  labeled  as  gray  matter)  when  working  on  the 
raw  HARDI  signal.  Highly  anisotropic  areas  (blue)  such  as 
the  genu/splenium  of  the  corpus  callosum  and  cortico-spinal 
tract  are  more  consistently  labeled.  Sharpening  the  4th  order 
ODFs  had  little  effect,  but  decreased  clustering  accuracy  for 
the  6th  order  ODFs,  perhaps  by  enhancing  high-frequency 
noise  in  the  higher-order  model. 

We  compared  our  estimates  to  the  known  complexity  of 
white  matter  configurations,  in  the  genu  of  corpus  callosum 
and  forceps  minor.  Callosal  fibers  are  tightly  packed  at  the 
interhemispheric  plane,  but  diverge  and  mingle  with  other 
fiber  bundles  as  they  progress  toward  the  frontal  lobes.  Our 
method  identifies  and  quantifies  this  increase  in  complexity. 
The  dimension  and  density  of  the  four  submanifolds  increase 
(Fig.  [2]»  as  fibers  leave  the  very  anisotropic  genu  region.  This 
clearly  demonstrates  our  methods  value  for  studying  white 
matter  microstructure. 

Finally,  we  applied  our  algorithm  to  the  6-dimensional 
diffusion  tensor  dataset,  clearly  differentiating  the  gray/white 
matter  and  CSF.  However,  the  difference  in  complexity  be¬ 
tween  gray  and  white  matter  was  low  and  the  CSF  was 
clearly  isolated,  although  it  was  not  when  working  with 
ODFs.  In  gray  and  white  matter,  diffusion  tensors  have  dif¬ 
ferent  anisotropy  but  similar  mean  diffusivity  (Fig.  [3J,  so 
the  model  complexity  stays  roughly  constant.  Conversely, 
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Fig.  1.  Influence  of  the  input  model  on  the  labeling  of  2  axial  slices. 


Fig.  3.  Correct  labeling  of  the  ventricles  from  DTI. 

HARDI  differentiated  gray  matter  from  complex,  anisotropic 
and  even  very  anisotropic  white  matter,  but  could  not  clearly 
label  the  CSF  where  ODFs  are  intrinsically  2D,  they  have 
great  angular  resolution  but  lack  the  amplitude  information 
of  DTI  (see  center  panel.  Fig.  [3j. 

4.  CONCLUDING  REMARKS 

We  presented  a  stratification  learning  method  to  study  the 
non-uniform  complexity  of  HARDI  datasets.  We  labeled 
known  neuro-anatomical  areas  by  examining  the  complexity 
of  the  point  clouds  obtained  from  a  set  of  Orientational  Dis¬ 
tribution  Functions.  Considering  such  high-dimensional  data 
as  belonging  to  a  union  of  manifolds  is  a  natural  and  powerful 
way  to  understand  cerebral  white  matter  connectivity. 
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